Machine Translation with Binary Feedback: a Large-Margin Approach

نویسندگان

  • Avneesh Saluja
  • Ian Lane
  • Ying Zhang
چکیده

Viewing machine translation as a structured classification problem has provided a gateway for a host of structured prediction techniques to enter the field. In particular, large-margin structured prediction methods for discriminative training of feature weights, such as the structured perceptron or MIRA, have started to match or exceed the performance of existing methods such as MERT. One issue with structured problems in general is the difficulty in obtaining fully structured labels, e.g., in machine translation, obtaining reference translations or parallel sentence corpora for arbitrary language pairs. Another issue, more specific to the translation domain, is the difficulty in online training of machine translation systems, since existing methods often require bilingual knowledge to correct translation output online. We propose a solution to these two problems, by demonstrating a way to incorporate binary-labeled feedback (i.e., feedback on whether a translation hypothesis is a “good” or understandable one or not), a form of supervision that can be easily integrated in an online manner, into a machine translation framework. Experimental results show marked improvement by incorporating binary feedback on unseen test data, with gains exceeding 5.5 BLEU points.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Online Large-Margin Training for Statistical Machine Translation

We achieved a state of the art performance in statistical machine translation by using a large number of features with an online large-margin training algorithm. The millions of parameters were tuned only on a small development set consisting of less than 1K sentences. Experiments on Arabic-toEnglish translation indicated that a model trained with sparse binary features outperformed a conventio...

متن کامل

A Novel Approach to Trace Time-Domain Trajectories of Power Systems in Multiple Time Scales Based Flatness

This paper works on the concept of flatness and its practical application for the design of an optimal transient controller in a synchronous machine. The feedback linearization scheme of interest requires the generation of a flat output from which the feedback control law can easily be designed. Thus the computation of the flat output for reduced order model of the synchronous machine with simp...

متن کامل

Identifying Useful Human Correction Feedback from an On-Line Machine Translation Service

Post-editing feedback provided by users of on-line translation services offers an excellent opportunity for automatic improvement of statistical machine translation (SMT) systems. However, feedback provided by casual users is very noisy, and must be automatically filtered in order to identify the potentially useful cases. We present a study on automatic feedback filtering in a real weblog colle...

متن کامل

Online Relative Margin Maximization for Statistical Machine Translation

Recent advances in large-margin learning have shown that better generalization can be achieved by incorporating higher order information into the optimization, such as the spread of the data. However, these solutions are impractical in complex structured prediction problems such as statistical machine translation. We present an online gradient-based algorithm for relative margin maximization, w...

متن کامل

Optimization Strategies for Online Large-Margin Learning in Machine Translation

The introduction of large-margin based discriminative methods for optimizing statistical machine translation systems in recent years has allowed exploration into many new types of features for the translation process. By removing the limitation on the number of parameters which can be optimized, these methods have allowed integrating millions of sparse features. However, these methods have not ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012